Mining top-K frequent itemsets through progressive sampling
Similar resources
Efficient Incremental Mining of Top-K Frequent Closed Itemsets
In this work we study the mining of top-K frequent closed itemsets, a recently proposed variant of the classical problem of mining frequent closed itemsets where the support threshold is chosen as the maximum value sufficient to guarantee that the itemsets returned in output be at least K. We discuss the effectiveness of parameter K in controlling the output size and develop an efficient algori...
Top-k-FCI: Mining Top-K Frequent Closed Itemsets in Data Streams
With the generation and analysis of stream data, such as real-time network monitoring, log records, and click streams, a great deal of attention has been paid to data stream mining in the field of data mining. In data stream mining, it is more reasonable to ask users to set a bound on the result size. Therefore, in this paper, a real-time single-pass algorithm, called ...
Efficient Frequent Itemsets Mining by Sampling
As the first stage of discovering association rules, frequent itemset mining is an important and challenging task for large databases. Sampling provides an efficient way to get approximate answers in much shorter time. Based on the characteristics of frequent itemset counting, a new bound for sampling is proposed, with which fewer samples are necessary to achieve the required accuracy and the e...
Mining Top-k Frequent Closed Itemsets in Data Streams Using Sliding Window
Frequent itemset mining has become a popular research area in the data mining community over the last few years. There are two main technical hitches in finding frequent itemsets. First, the user must provide an appropriate minimum support value to start with and then tune it by running the algorithm again and again. Second, the generated frequent itemsets are mostly numerous and ...
An Asymptotically Tighter Bound on Sampling for Frequent Itemsets Mining
In this paper we present a new error bound on sampling algorithms for frequent itemsets mining. We show that the new bound is asymptotically tighter than the state-of-the-art bounds, i.e., given the chosen samples, for small enough error probability, the new error bound is roughly half of the existing bounds. Based on the new bound, we give a new approximation algorithm, which is much simpler compa...
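The first abstract in this list defines the top-K support threshold as the largest value that still returns at least K frequent closed itemsets. The following is a minimal brute-force sketch of that definition only, not the algorithm of any of the listed papers. The function names (`supports`, `closed_itemsets`, `top_k_closed`) and the toy transactions are hypothetical, and the exhaustive enumeration is practical only for very small inputs.

```python
from itertools import combinations


def supports(transactions):
    """Count the support of every non-empty itemset by brute force."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return counts


def closed_itemsets(counts):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        itemset: count
        for itemset, count in counts.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in counts.items()
        )
    }


def top_k_closed(transactions, k):
    """Return (threshold, itemsets), where threshold is the largest support
    value for which at least k closed itemsets are frequent."""
    closed = closed_itemsets(supports(transactions))
    ranked = sorted(closed.values(), reverse=True)
    if len(ranked) < k:
        raise ValueError("fewer than k closed itemsets exist")
    threshold = ranked[k - 1]  # the k-th largest support
    result = {s: c for s, c in closed.items() if c >= threshold}
    return threshold, result


# Hypothetical toy data for illustration.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}, {"b", "c"}]
threshold, itemsets = top_k_closed(toy, 2)
```

Because several itemsets can tie at the threshold, the result may contain more than K itemsets, which is why the definition says "at least K".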
Journal
Journal title: Data Mining and Knowledge Discovery
Year: 2010
ISSN: 1384-5810,1573-756X
DOI: 10.1007/s10618-010-0185-7